On the Consistency of Output Code Based Learning Algorithms for Multiclass Learning Problems

نویسندگان

  • Harish G. Ramaswamy
  • Balaji Srinivasan Babu
  • Shivani Agarwal
  • Robert C. Williamson
چکیده

A popular approach to solving multiclass learning problems is to reduce them to a set of binary classification problems through some output code matrix: the widely used one-vs-all and all-pairs methods, and the error-correcting output code methods of Dietterich and Bakiri (1995), can all be viewed as special cases of this approach. In this paper, we consider the question of statistical consistency of such methods. We focus on settings where the binary problems are solved by minimizing a binary surrogate loss, and derive general conditions on the binary surrogate loss under which the one-vs-all and all-pairs code matrices yield consistent algorithms with respect to the multiclass 0-1 loss. We then consider general multiclass learning problems defined by a general multiclass loss, and derive conditions on the output code matrix and binary surrogates under which the resulting algorithm is consistent with respect to the target multiclass loss. We also consider probabilistic code matrices, where one reduces a multiclass problem to a set of class probability labeled binary problems, and show that these can yield benefits in the sense of requiring a smaller number of binary problems to achieve overall consistency. Our analysis makes interesting connections with the theory of proper composite losses (Buja et al., 2005; Reid and Williamson, 2010); these play a role in constructing the right ‘decoding’ for converting the predictions on the binary problems to the final multiclass prediction. To our knowledge, this is the first work that comprehensively studies consistency properties of output code based methods for multiclass learning.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Solving Multiclass Learning Problems viaError - Correcting Output

Multiclass learning problems involve nding a deenition for an unknown function f (x) whose range is a discrete set containing k > 2 values (i.e., k \classes"). The deenition is acquired by studying collections of training examples of the form hx i ; f (x i)i. Existing approaches to multiclass learning problems include direct application of multiclass algorithms such as the decision-tree algorit...

متن کامل

Error-Correcting Output Codes: A General Method for Improving Multiclass Inductive Learning Programs

Multiclass learning problems involve nding a deeni-tion for an unknown function f (x) whose range is a discrete set containing k > 2 values (i.e., k \classes"). The deenition is acquired by studying large collections of training examples of the form hx i ; f (x i)i. Existing approaches to this problem include (a) direct application of multiclass algorithms such as the decision-tree algorithms I...

متن کامل

Solving Multiclass Learning Problems via Error-Correcting Output Codes

Multiclass learning problems involve nding a de nition for an unknown function f(x) whose range is a discrete set containing k > 2 values (i.e., k \classes"). The de nition is acquired by studying collections of training examples of the form hxi; f(xi)i. Existing approaches to multiclass learning problems include direct application of multiclass algorithms such as the decision-tree algorithms C...

متن کامل

Design and Analysis of Consistent Algorithms for Multiclass Learning Problems

We consider the broad framework of supervised learning, where one gets examples of objects together with some labels (such as tissue samples labeled as cancerous or non-cancerous, or images of handwritten characters labeled with the correct character in a-z), and the goal is to learn a prediction model which given a new object, makes an accurate prediction. The notion of accuracy depends on the...

متن کامل

Reducing Multiclass to Binary: A Unifying Approach for Margin Classifiers

We present a unifying framework for studying the solution of multiclass categorization problems by reducing them to multiple binary problems that are then solved using a margin-based binary learning algorithm. The proposed framework unifies some of the most popular approaches in which each class is compared against all others, or in which all pairs of classes are compared to each other, or in w...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2014